TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play
Author
Abstract
TD-Gammon is a neural network that is able to teach itself to play backgammon solely by playing against itself and learning from the results, based on the TD(λ) reinforcement learning algorithm (Sutton 1988). Despite starting from random initial weights (and hence random initial strategy), TD-Gammon achieves a surprisingly strong level of play. With zero knowledge built in at the start of learning (i.e., given only a "raw" description of the board state), the network learns to play at a strong intermediate level. Furthermore, when a set of handcrafted features is added to the network's input representation, the result is a truly staggering level of performance: the latest version of TD-Gammon is now estimated to play at a strong master level that is extremely close to the world's best human players.
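As a rough illustration of the TD(λ) update the abstract refers to, here is a minimal sketch on a toy problem. It is not TD-Gammon: it assumes a five-state random walk with a terminal reward of 1 on the right and 0 on the left, and it uses a table of values in place of a neural network. The function name `td_lambda_random_walk` and all parameter values are hypothetical choices made for this example.

```python
import random


def td_lambda_random_walk(episodes=5000, n_states=5, alpha=0.01, lam=0.7, seed=0):
    """Estimate state values on a toy random walk with TD(lambda).

    Assumptions (not from the paper): states 0..n_states-1, equal chance of
    stepping left or right, outcome 0 when exiting left and 1 when exiting
    right, no discounting, and one table entry per state.
    """
    rng = random.Random(seed)
    values = [0.5] * n_states  # initial guess for every state

    for _ in range(episodes):
        traces = [0.0] * n_states  # eligibility traces, reset each episode
        state = n_states // 2      # every episode starts in the middle

        while True:
            next_state = state + (1 if rng.random() < 0.5 else -1)

            # The target is the final outcome at a terminal step, otherwise
            # the current estimate of the successor state (bootstrapping).
            if next_state < 0:
                target, done = 0.0, True
            elif next_state >= n_states:
                target, done = 1.0, True
            else:
                target, done = values[next_state], False

            td_error = target - values[state]

            # Decay all traces by lambda, then mark the state just visited.
            traces = [lam * t for t in traces]
            traces[state] += 1.0

            # Every recently visited state shares in the correction.
            for i in range(n_states):
                values[i] += alpha * td_error * traces[i]

            if done:
                break
            state = next_state

    return values


if __name__ == "__main__":
    for i, v in enumerate(td_lambda_random_walk()):
        print(f"state {i}: estimated value {v:.3f}")
```

For this walk the true probability of exiting on the right from state i is (i + 1) / 6, so the estimates should settle near those values. In TD-Gammon the table is replaced by a network evaluating board positions and the experience comes from self-play, but the idea of nudging earlier predictions toward later ones is the same.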
Related articles
Programming backgammon using self-teaching neural nets
TD-Gammon is a neural network that is able to teach itself to play backgammon solely by playing against itself and learning from the results. Starting from random initial play, TD-Gammon's self-teaching methodology results in a surprisingly strong program: without lookahead, its positional judgement rivals that of human experts, and when combined with shallow lookahead, it reaches a level of pla...
Feature Construction for Reinforcement Learning in Hearts
Temporal difference (TD) learning has been used to learn strong evaluation functions in a variety of two-player games. TD-gammon illustrated how the combination of game tree search and learning methods can achieve grand-master level play in backgammon. In this work, we develop a player for the game of hearts, a 4-player game, based on stochastic linear regression and TD learning. Using a small ...
Backgammon, anyone? Neural learning theory tested
Madison, Wis. A computer took on a human grand master and won 99 of 100 games at the American Association of Artificial Intelligence meeting held here recently. Though the system used was developed at IBM Corp.'s T.J. Watson Research Center, this time the game was not chess, but backgammon. Also, unlike Deep Blue, the machine playing was not running a conventional computer program, but a neural...
DRAFT GP-Gammon: Genetically Programming Backgammon Players
We apply genetic programming to the evolution of strategies for playing the game of backgammon. We explore two different strategies of learning: using a fixed external opponent as teacher, and letting the individuals play against each other. We conclude that the second approach is better and leads to excellent results: Pitted in a 1000-game tournament against a standard benchmark player—Pubeval...
Why did TD-Gammon Work?
Although TD-Gammon is one of the major successes in machine learning, it has not led to similar impressive breakthroughs in temporal difference learning for other applications or even other games. We were able to replicate some of the success of TD-Gammon, developing a competitive evaluation function on a 4000 parameter feed-forward neural network, without using back-propagation, reinforcement ...
Journal:
- Neural Computation
Volume 6, Issue
Pages -
Publication date: 1994